ABSTRACT

Studies of the peculiar velocity bulk flow based on different tools and
datasets have so far been consistent in their estimates of the direction of the
flow, which also happens to lie in close proximity to several features
identified in the cosmic microwave background. This provides motivation to use
new compilations of type-Ia supernovae measurements to pinpoint it with better
accuracy and up to higher redshift. Unfortunately, the peculiar velocity field
estimated from the most recent Union2.1 compilation suffers from large
individual errors, poor sky coverage and low redshift-volume density. We show
that as a result, any naive attempt to calculate the best-fit bulk flow and its
significance will be severely biased. Instead, we introduce an iterative method
which calculates the amplitude and the scatter of the direction of the best-fit
bulk flow as deviants are successively removed, and which takes into account
the sparsity of the data when estimating the significance of the result. Using
200 supernovae up to a redshift of z = 0.2, we find that while the amplitude of
the bulk flow is marginally consistent with the value expected in a ΛCDM
universe given the large bias, the scatter of the direction is significantly
low (at > 99.5% C.L.) when compared to random simulations, supporting the quest
for a cosmological origin.
1 INTRODUCTION

In the last couple of decades a considerable effort has been devoted to the
analysis of the peculiar velocity field in search of an overall bulk flow (BF)
on ever increasing scales, lately reaching as high as ∼ 100 Mpc/h using galaxy
surveys [1–16] and type-Ia supernovae (SNe) [17–23], and even an order of
magnitude higher, based on measurements of the kinetic Sunyaev-Zeldovich effect
in the cosmic microwave background (CMB) [H-El]. While there have been
conflicting claims regarding the amplitude of the dipole moment of this field
and its tension with the expected value in a ΛCDM universe, the vast majority
of these surveys have been consistent in their findings for the direction of
the dipole.¹

Meanwhile, several features in the CMB temperature maps from the COBE DMR [3J
and WMAP [32] experiments have been identified in roughly the same region of
the sky, from the dipole moment [33] to several reported anomalies, including
the alignment between the quadrupole and octupole [13, |35|], mirror parity
[3^ - [3^ | and giant rings [39]. This coincidence provides further motivation
to search for a unified cosmological explanation [40].

Over the years a number of cosmological scenarios have been suggested as
possible sources for a peculiar velocity BF, such as a great attractor QLQ, a
super-horizon tilt, over-dense regions resulting from bubble collisions [42] or
induced by cosmic defects² HI, 13, etc. In an attempt to test these hypotheses
and distinguish between them, any knowledge regarding the redshift dependence
of the BF can be a crucial discriminator.



1 Using type-Ia SNe, [3(jll recently found that the direction of highest cosmic
expansion rate is also in the vicinity of this dipole, though it is consistent
with the expectation from ΛCDM.

2 An over-density induced by a pre-inflationary particle would imprint giant
rings in the CMB whose center is aligned with the BF [H|.
Type-Ia SNe, whose simple scaling relations provide empirical distance
measurements and which have been detected up to redshifts z > 1, provide a
unique tool to estimate the peculiar velocity BF and study its direction and
redshift extent. However, this approach also comes with certain caveats. First,
datasets from typical type-Ia SNe surveys are orders of magnitude smaller than
those from galaxy surveys and their sky coverage and redshift-volume density
are extremely poor. Secondly, different SNe compilations often use different
light-curve fitters, involving different nuisance parameters. Currently, the
most promising candidate for a large-scale BF search is the Union2.1
compilation [45] (see also @i E3]), comprising 19 different surveys which are
all analyzed with a single light-curve fitter (SALT2 [4^]).

The purpose of this work is to investigate the peculiar 
velocity field extracted from the Union2.1 data and given 
its limitations determine which conclusions can be reliably 
made as to the BF in the inferred radial peculiar velocity 
field, placing an emphasis on its direction and redshift ex- 
tent. Accounting for the substantial bias due to the sparsity 
of the data and using a dedicated algorithm to iteratively 
remove outlying data points from the analysis, we test the 
amplitude and the scatter of the direction of the BF and 
estimate the significance of the results using Monte Carlo 
simulations. 

The paper is organized as follows. In Section 2 we de- 
scribe the initial filtering of the data and the method for 
extracting the individual radial components of the peculiar 
velocities, as well as how we generate random simulations 
of data with the same spatial distribution. In Section 3 we 
address the inevitable bias due to sparsity in both the ampli- 
tude and direction in naive best-fit methods used to detect 
an overall BF. We introduce our method in Section 4 and 
define a score which measures the scatter of the best-fit di- 
rection in successive iterations. We demonstrate that this 
score is effective in identifying simulated datasets with an 
artificially inserted BF and discuss how the significance of its 
findings can be estimated. In Section 5 we consider both the 
full dataset and the application of the scatter-based iterative 
method to the data and present the results. We conclude in 
Section 6. 



2 PRELIMINARIES 
2.1 Data filtering 

We use the recent type-Ia SNe compilation Union2.1 [45], which contains 580
filtered SNe at redshifts 0.015 < z < 1.4. This compilation is drawn from 19
datasets, all uniformly analyzed with a single light-curve fitter (SALT2 Q),
and analyzed in the CMB frame. At high redshifts, the spatial distribution of
this dataset grows increasingly sparse, the individual errors become large and
some of the induced radial peculiar velocities, calculated as described below,
take on unreasonable values (such as > 0.5c). In order to avoid these
pathologies while still retaining the ability to examine the behavior at
distances larger than those accessible with galaxy surveys (< 100 Mpc/h), we
apply an initial cutoff in redshift and remove all points with z > 0.2
(corresponding to < 550 Mpc/h) from our dataset.

In Fig. 1 we plot the spatial distribution of the Union2.1 dataset with the
remaining 200 SNe marked in blue. It is apparent that even outside the galactic
plane the sky coverage is quite poor and the three-dimensional distribution of
the remaining data is sparse and inhomogeneous. The implications of this will
be discussed in the next section.

Figure 1. The spatial scatter of all SNe in the Union2.1 compilation. The red
triangles indicate SNe with z > 0.2 that are filtered out before any analysis
is performed.



2.2 Peculiar velocities and best-fit bulk flow 

The Union2.1 dataset specifies for each SN its measured redshift $z$, the
inferred distance modulus $\mu_{\rm obs}$ and the error $\Delta\mu_{\rm obs}$.
The relation between the cosmological ("free") redshift $\bar z$ and the
distance modulus is given by

$$ \mu = 5 \log_{10}\left( \frac{\bar d_L(\bar z)}{1\,{\rm Mpc}} \right) + 25, \tag{1} $$

where $\bar d_L$ is the luminosity distance, which in a flat universe with
matter density $\Omega_m$, a cosmological constant $\Omega_\Lambda$ and a
current Hubble parameter $H_0$, is

$$ \bar d_L(\bar z) = \frac{1+\bar z}{H_0} \int_0^{\bar z} \frac{dz'}{\sqrt{\Omega_m (1+z')^3 + \Omega_\Lambda}}. \tag{2} $$



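Due to the peculiar velocity, both the observed redshift and distance modulus
(through the luminosity distance) will differ from their true cosmological
values [49]. To first order in $\mathbf{v}\cdot\hat{\mathbf{n}}$, where
$\hat{\mathbf{n}}$ is the direction pointing to a SN with peculiar velocity
$\mathbf{v}$, we get

$$ 1 + z = (1 + \bar z)(1 + \mathbf{v}\cdot\hat{\mathbf{n}}), \qquad d_L(z) = \bar d_L(\bar z)\,(1 + 2\,\mathbf{v}\cdot\hat{\mathbf{n}}), \tag{3} $$

where barred quantities denote the true cosmological values.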



In order to extract the radial peculiar velocity $v_r$ of the SNe in our
dataset, we follow the first order expansion in [20, 49]

$$ v_r = -\frac{\ln 10}{5}\, \frac{H(z)\, d_A(z)}{1 - H(z)\, d_A(z)}\, \left( \mu_{\rm obs} - \mu(z) \right), \tag{4} $$

where $H(z)$ is the Hubble parameter at redshift $z$, and
$d_A(z) = d_L(z)/(1+z)^2$ is the observed angular diameter distance to the SN.
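As an illustration of Eqs. (1), (2) and (4), the following is a minimal sketch of how a radial peculiar velocity could be computed for one SN. It is not the authors' code. The function names, the default parameters `h0 = 70` km/s/Mpc and `omega_m = 0.27`, and the use of Simpson's rule are assumptions made for this example; the speed of light, which the equations above set to 1, is restored explicitly.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def hubble(z, h0=70.0, omega_m=0.27):
    """H(z) in km/s/Mpc for a flat universe (Omega_Lambda = 1 - Omega_m)."""
    return h0 * math.sqrt(omega_m * (1 + z) ** 3 + (1 - omega_m))

def lum_distance(z, h0=70.0, omega_m=0.27, steps=200):
    """Eq. (2): luminosity distance in Mpc, integrated with Simpson's rule.

    `steps` must be even.
    """
    dz = z / steps
    total = 0.0
    for k in range(steps + 1):
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight / hubble(k * dz, h0, omega_m)
    return (1 + z) * C_KM_S * total * dz / 3

def distance_modulus(z, h0=70.0, omega_m=0.27):
    """Eq. (1): distance modulus for a luminosity distance in Mpc."""
    return 5 * math.log10(lum_distance(z, h0, omega_m)) + 25

def radial_velocity(z, mu_obs, h0=70.0, omega_m=0.27):
    """Eq. (4): radial peculiar velocity in km/s."""
    d_a = lum_distance(z, h0, omega_m) / (1 + z) ** 2
    x = hubble(z, h0, omega_m) * d_a / C_KM_S  # dimensionless H(z) d_A(z)
    residual = mu_obs - distance_modulus(z, h0, omega_m)
    return -C_KM_S * (math.log(10) / 5) * x / (1 - x) * residual
```

In this sketch a SN that appears fainter than expected at its redshift (`mu_obs` above the model value) is assigned a negative radial velocity, i.e. motion towards the observer, as the sign of Eq. (4) implies.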

We then find the best-fit BF velocity $\mathbf{v}_{\rm BF}$ in our set of $N$
SNe, each with a radial velocity amplitude $v_r^i$ in a direction
$\hat{\mathbf{n}}_i$, by minimizing

$$ \chi^2(\mathbf{v}_{\rm BF}) = \sum_{i=1}^{N} \frac{\left( v_r^i - \mathbf{v}_{\rm BF}\cdot\hat{\mathbf{n}}_i \right)^2}{(\Delta v_r^i)^2} \tag{5} $$

with respect to the direction and amplitude of $\mathbf{v}_{\rm BF}$, where
$\Delta v_r^i$ are the individual errors obtained from the measurement errors
in the distance moduli $\Delta\mu_{\rm obs}$ using Eq. (4).
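Because Eq. (5) is quadratic in the three Cartesian components of the BF velocity, its minimum can be found by solving a 3×3 linear system (the weighted normal equations). The sketch below shows one way to do this in plain Python; the function names `fit_bulk_flow` and `solve3` are hypothetical, and the text does not state which minimizer was actually used.

```python
def fit_bulk_flow(directions, v_r, errors):
    """Minimise Eq. (5) and return the best-fit flow as [vx, vy, vz].

    directions: unit vectors n_i as (x, y, z) tuples
    v_r, errors: radial velocities and their 1-sigma errors (same units)
    """
    a = [[0.0] * 3 for _ in range(3)]  # sum_i w_i n_i n_i^T
    b = [0.0] * 3                      # sum_i w_i v_r^i n_i
    for n, v, err in zip(directions, v_r, errors):
        w = 1.0 / err ** 2
        for p in range(3):
            b[p] += w * v * n[p]
            for q in range(3):
                a[p][q] += w * n[p] * n[q]
    return solve3(a, b)

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - tail) / m[r][r]
    return x
```

The amplitude and direction follow from the returned components. For a sparse sky distribution the 3×3 matrix can be poorly conditioned, which is one way of seeing why the fit is sensitive to the spatial distribution of the data.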

2.3 Monte Carlo simulations 

Our Monte Carlo simulations consist of random permutations of the sky locations
of the SNe in our dataset, after removing the initial BF velocity
$\mathbf{v}_{\rm BF}^{\rm init}$ from the entire set by subtracting its
corresponding component from the individual velocities

$$ v_r^i \rightarrow v_r^i - \mathbf{v}_{\rm BF}^{\rm init}\cdot\hat{\mathbf{n}}_i. \tag{6} $$

The new dataset will have the same spatial distribution and its own initial
random BF with a typical ∼ rms amplitude. In order to simulate a random
realization with a specifically chosen cosmological BF (up to statistical
noise), for the purposes of testing our method, we simply add the chosen
non-random BF contribution to the individual velocities after the permutation.
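The procedure described above can be sketched as follows. This is an illustrative reading of the text, not the authors' implementation: the function name `permuted_realization` is hypothetical, and it is assumed here that each velocity stays paired with its own error while only the sky directions are shuffled.

```python
import random

def permuted_realization(directions, v_r, errors, v_init,
                         v_insert=(0.0, 0.0, 0.0), rng=random):
    """Build one random realization of the dataset.

    1. Subtract the initial best-fit flow v_init from each velocity, Eq. (6).
    2. Shuffle the sky directions among the SNe (assumption: velocities and
       errors keep their original pairing).
    3. Optionally add an inserted flow v_insert along the new directions.
    """
    def dot(u, n):
        return sum(a * b for a, b in zip(u, n))

    residual = [v - dot(v_init, n) for v, n in zip(v_r, directions)]
    shuffled = list(directions)
    rng.shuffle(shuffled)
    new_v = [v + dot(v_insert, n) for v, n in zip(residual, shuffled)]
    return shuffled, new_v, list(errors)
```

Passing a seeded `random.Random` instance as `rng` makes a set of realizations reproducible.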

For the analysis in this paper we use 16,000 random realizations (spatial
permutations) of the Union2.1 data with no artificially inserted BF in order to
test against the null hypothesis. To examine the detection capabilities of our
method, we use 6,000 different random realizations for each inserted BF
amplitude in the range $|\mathbf{v}_{\rm BF}| = \{50, 100, \ldots, 450\}$ km/s,
all in the direction (l, b) = (295°, 5°), which is the direction of the
best-fit BF on the full dataset (for the purposes of estimating the
significance of our results, we have verified that this specific choice of
direction has no effect).



3 SPARSITY BIAS 

As mentioned above, the spatial distribution of the SNe 
dataset is inhomogeneous and sparse across the sky and in 
redshift depth. As a consequence, any search for an overall 
BF will be severely biased. Such a bias must be taken into account when
evaluating the significance of a measured best-fit BF vs. the expectation from
a ΛCDM universe. We now examine this bias separately in terms of the direction
and amplitude of the BF. In the first subsection we show that the sparsity of
our dataset causes a preference for a flow in directions within the galactic
plane. In the second subsection we show that the root-mean-square (rms)
velocity typically used under the ΛCDM hypothesis is inappropriate for a
significance estimation of the BF amplitude in a sparse dataset such as ours.

3.1 Bulk flow - direction 

In a dense homogeneous dataset (which has no preferred direction), if we
perform many random mixings of the sky locations of the SNe, the best-fit BF
direction will be distributed uniformly over the $4\pi$ area of the sky. In a
histogram of the measured directions, inside a circle of radius $\alpha$ around
any sky coordinate we expect to find a fraction

$$ f(\alpha) = \sin^2(\alpha/2) \tag{7} $$

of the results.
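For completeness, Eq. (7) is the ratio of the solid angle of a spherical cap of angular radius $\alpha$ to the full sky; a short derivation, with $\theta$ and $\phi$ denoting polar and azimuthal angles measured from the chosen sky coordinate:

```latex
f(\alpha) = \frac{1}{4\pi} \int_0^{2\pi} d\phi \int_0^{\alpha} \sin\theta \, d\theta
          = \frac{1 - \cos\alpha}{2}
          = \sin^2(\alpha/2).
```

For $\alpha = 20^\circ$ this gives $f \approx 0.030$, i.e. about 3% of isotropically distributed directions are expected within such a circle.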

To demonstrate the bias induced by the sparsity of our dataset, we plot in
Fig. 2 the ratios between the measured fraction and its expected value
$f_{\rm meas}/f$ for a uniformly distributed set (Left) as well as for our
dataset (Right), using $\alpha = 20°$. We see that the result for our dataset
is far from isotropic. Its spatial distribution, regardless of the observed
magnitudes, is biased towards a specific portion of the sky, namely the region
surrounding the galactic plane.

3.2 Bulk flow - amplitude 

Another implication of the sparsity in the data is that the random component of
the BF does not follow the expected ΛCDM behavior. We must therefore quantify
the difference between the expected ΛCDM rms velocity and the rms velocity we
expect when dealing with a sparse dataset such as Union2.1.

3.2.1 Velocity rms in ΛCDM

In ΛCDM, the expected value for the BF amplitude is zero,
$\langle \mathbf{v} \rangle = 0$, while its variance satisfies ||

$$ \sigma_\Lambda^2 = \langle \mathbf{v}\cdot\mathbf{v} \rangle = \frac{H_0^2 f^2}{2\pi^2} \int dk\, P(k)\, |W(kR)|^2, \tag{8} $$

where $f = \Omega_m^{0.55}$ is the dimensionless linear growth rate, $P(k)$ is
the matter power spectrum, $W(kR)$ is the Fourier transform of a window
function with characteristic scale $R$ and the angle brackets
$\langle .. \rangle$ denote an ensemble average. Since ΛCDM is isotropic, this
means that for each primary direction $i \in \{x, y, z\}$ in a Cartesian
coordinate system the BF amplitude may be described using a normal distribution

$$ v_i \sim \mathcal{N}(0, \sigma_\Lambda/\sqrt{3}), \qquad i = x, y, z. \tag{9} $$

To estimate the significance of a non-vanishing BF measured in a given survey,
a common approach is to tweak the frame of reference so that the BF points
exactly in one of the primary directions, e.g. $\hat e_y$, and compare the
"single component" measured BF to $\sigma_\Lambda/\sqrt{3}$. However, this
ignores the fact that the BF amplitude in the other two directions vanishes due
to this particular choice of frame and would lead to an overestimated
significance of the BF amplitude.

To resolve this, we use the fact that the BF amplitude is the square root of a
sum of squares of three normally distributed variables,
$|\mathbf{v}|^2 = \sum_i v_i^2$, and so in ΛCDM it satisfies

$$ \Lambda{\rm CDM}: \quad |\mathbf{v}| \sim \chi_3\left(\sqrt{3}\,x/\sigma_\Lambda\right). \tag{10} $$

That is, it follows an "unnormalized" $\chi$ distribution with 3 degrees of
freedom (a Maxwell-Boltzmann distribution) [Hoi ]. Eq. (10) represents the
probability density function (PDF) of the BF amplitude inside some volume that
is modulated by the same window function as in Eq. (8), in an unbiased way.

3.2.2 Velocity rms in Union2.1 

In order for the right-hand side of Eq. (10) to describe the observed BF
$|\mathbf{v}^{\rm obs}|$ appropriately, one needs to measure the peculiar
velocity in many spatial locations, so that the typical separation between any
two measured nearest neighbors is much smaller than the coherence scale. This
is clearly not satisfied for the sparse Union2.1 dataset.
Therefore we should replace the window function with a sum of $N$ delta
functions, each centered on the location $\mathbf{R}_i$ of a single SN

$$ W(\mathbf{r}) = \frac{1}{N} \sum_{i=1}^{N} \delta(\mathbf{r} - \mathbf{R}_i). \tag{11} $$

However, since in Fourier space

$$ \delta(\mathbf{r} - \mathbf{R}_i) \rightarrow \exp\{-i\,\mathbf{k}\cdot\mathbf{R}_i\}, \tag{12} $$

the new window function term will consist of ∼ $N^2$ interference terms that
are no longer spherically symmetric. Therefore using Eq. (11) to evaluate the
sparse-case equivalent of Eq. (10) is unfeasible. Instead we use the amplitudes
of the best-fit BF of 16,000 random spatial permutations of our dataset, as
described in §2.3, as an approximation of the PDF for the BF amplitude inside a
sphere of radius z = 0.2. The difference between this approximation and the
isotropic ΛCDM scenario will be encoded in a best-fit $\sigma_{\rm fit}$
(instead of $\sigma_\Lambda$) which describes the observed distribution

$$ {\rm Sparsity}: \quad |\mathbf{v}^{\rm obs}| \sim \chi_3\left(\sqrt{3}\,x/\sigma_{\rm fit}\right). \tag{13} $$

Figure 2. Left: The ratio between the measured and expected fractions
$f_{\rm meas}/f$ of 16,000 randomly picked directions (simulating the
homogeneous case) that point within $\alpha = 20°$ from each coordinate. Right:
The same ratio inferred from 16,000 random permutations of the locations of SNe
of the Union2.1 dataset, where the preference for the galactic plane is clearly
seen.



In Fig. 3 we plot the approximated PDF for the BF amplitude and the
corresponding best-fit $\chi_3$ distribution according to Eq. (13). We find

$$ \sigma_{\rm fit} \approx 150\ {\rm km/s} \tag{14} $$

as opposed to $\sigma_\Lambda = 43$ km/s calculated directly from Eq. (8) for a
top-hat window function of size R = 550 Mpc/h. We see from Fig. 3 that a naive
estimate of the significance of a measured BF amplitude in a sparse survey
would be a severe overestimate.
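To make the effect of replacing $\sigma_\Lambda$ by $\sigma_{\rm fit}$ concrete, the tail probability of the $\chi_3$ (Maxwell-Boltzmann) form in Eqs. (10) and (13) has a closed-form expression. The sketch below evaluates it for the two rms values quoted above and for the amplitude of 260 km/s reported in Section 5. The function name `chi3_tail` is hypothetical, and the analytic form only stands in for the simulated PDF used in the paper.

```python
import math

def chi3_tail(v, sigma):
    """P(|v| > v) for a Maxwell-Boltzmann distribution with rms `sigma`.

    Each Cartesian component is normal with standard deviation sigma/sqrt(3),
    so x = sqrt(3) * v / sigma is the amplitude in units of that deviation.
    """
    x = math.sqrt(3) * v / sigma
    return (math.erfc(x / math.sqrt(2))
            + math.sqrt(2 / math.pi) * x * math.exp(-x * x / 2))

# sigma_Lambda and sigma_fit as quoted in the text, both in km/s
for sigma in (43.0, 150.0):
    print(sigma, chi3_tail(260.0, sigma))
```

With the unbiased $\sigma_\Lambda$ the tail probability comes out many orders of magnitude smaller than with $\sigma_{\rm fit}$, which is the overestimate of significance described above.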



4 SCATTER-BASED METHOD 

We present a method based on an iterative process of re- 
peatedly fitting a BF to the peculiar velocity field after the 
removal of the datapoint with the highest deviation from the 
previous fit. If there is a significant BF in the full dataset, 
the compactness of the scatter in the directions identified 
for the best-fit BF in each iteration can be used as an effi- 
cient estimator of the significance of the original flow. The 
stronger the flow in the full dataset, the smaller the scatter 
we will measure in the iterations. 



[Figure 3 plot. Horizontal axis: v [km/s]. Legend: random realizations;
best-fit $\chi_3(\sqrt{3}x/\sigma_{\rm fit})$ distribution; ΛCDM
$\chi_3(\sqrt{3}x/\sigma_\Lambda)$ distribution.]

Figure 3. The normalized PDF for the amplitude of the BF for ΛCDM inside a
sphere of radius 550 Mpc/h, calculated according to Eqs. (9)-(10) (black), as
well as for random realizations given the sparsity of the Union2.1 data (blue).
The red line is the best-fit $\chi_3(\sqrt{3}x/\sigma_{\rm fit})$ according to
16,000 random permutations of the locations of our dataset, as described in
§3.2. The goodness of fit is $R^2 = 0.9945$.



4.1 Iterative algorithm 

Heuristically, the algorithm can be sketched as follows

BF fitting → residuals → outlier removal → (back to BF fitting), repeated over
the iterations.

After calculating the best-fit BF of the complete set according to Eq. (5), we
examine the residual velocities of the different SNe in order to identify the
one with the strongest deviation from the bulk. In each iteration, we find the
best-fit $\mathbf{v}_{\rm BF}^{\rm iter}$ and then identify the point $i$ with
the largest contribution

$$ \Delta\chi_i^2(\mathbf{v}_{\rm BF}^{\rm iter}) = \frac{\left( v_r^i - \mathbf{v}_{\rm BF}^{\rm iter}\cdot\hat{\mathbf{n}}_i \right)^2}{(\Delta v_r^i)^2} \tag{15} $$



and remove it from the dataset before the next iteration. By iteratively
removing these deviants, we can also verify that our results are not dominated
by a small subset of the data with some common characteristic such as low
redshift or a specific location on the sky.

Figure 4. Illustration of the scatter of the best-fit BF direction in each
iteration for a random realization with no inserted flow (blue) and for various
sets with artificially inserted BF amplitudes, all in the direction
(l, b) = (295°, 5°). The total number of iterations shown here is 191, after
which only 10 SNe were left in the dataset. The score for each realization is
shown in the appropriate color, normalized by the score of the real data.
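The loop described in this subsection can be sketched as below. The function name `iterative_removal` and its arguments are hypothetical. The fitting step is passed in as a callable `fit` so that any minimizer of Eq. (5) can be used, and the stopping rule (`n_keep` remaining SNe) is an assumption based on the caption of Fig. 4.

```python
def iterative_removal(directions, v_r, errors, fit, n_keep=10):
    """Repeatedly fit a bulk flow and drop the point with the largest
    Delta chi^2 contribution, Eq. (15).

    fit(directions, v_r, errors) must return the best-fit flow (vx, vy, vz).
    Returns the list of best-fit flows (one per iteration) and the original
    indices of the removed points, in order of removal.
    """
    alive = list(range(len(v_r)))
    flows, removed = [], []
    while len(alive) > n_keep:
        n = [directions[i] for i in alive]
        v = [v_r[i] for i in alive]
        e = [errors[i] for i in alive]
        flow = fit(n, v, e)
        flows.append(flow)

        def delta_chi2(k):
            model = sum(f * c for f, c in zip(flow, n[k]))
            return ((v[k] - model) / e[k]) ** 2

        worst = max(range(len(alive)), key=delta_chi2)
        removed.append(alive.pop(worst))
    return flows, removed
```

Keeping the removal order makes it possible to check, as the text suggests, whether the removed points share a common characteristic such as redshift or sky position.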

4.2 Scatter score 

To measure the scatter, we assign a scatter score to the data, defined by

$$ S = \sum_j \left[ \arccos(\hat{\mathbf{n}}_j\cdot\hat{\mathbf{n}}_1) + \arccos(\hat{\mathbf{n}}_j\cdot\hat{\mathbf{n}}_{j-1}) \right], \tag{16} $$

which is a cumulative sum of the consecutive and total shifts, i.e. the sum of
the distances from the direction of the best-fit BF in the current iteration
$\hat{\mathbf{n}}_j$ to the one in the previous iteration
$\hat{\mathbf{n}}_{j-1}$ and to that of the first iteration
$\hat{\mathbf{n}}_1$. This measures both the tightness and the extent of the
scatter of the measured directions throughout the iterative process.
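A minimal sketch of Eq. (16), taking the list of best-fit flow vectors (or unit directions) from successive iterations. The function name `scatter_score` is hypothetical, and the sum is assumed to start at the second iteration, where a previous direction first exists.

```python
import math

def scatter_score(flow_directions):
    """Eq. (16): for each iteration j >= 2, add the angle between n_j and the
    first direction n_1 and the angle between n_j and n_{j-1} (in radians)."""
    def angle(u, w):
        dot = sum(a * b for a, b in zip(u, w))
        norm = math.sqrt(sum(a * a for a in u) * sum(b * b for b in w))
        # Clamp to [-1, 1] to guard against rounding just outside the domain.
        return math.acos(max(-1.0, min(1.0, dot / norm)))

    first = flow_directions[0]
    return sum(angle(cur, first) + angle(cur, prev)
               for prev, cur in zip(flow_directions, flow_directions[1:]))
```

A sequence of directions that never moves scores zero, while a sequence that wanders far from its starting point accumulates both terms.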

In Fig. 4 we demonstrate the results for the scatter score S by plotting the
directions of the best-fit BF at each iteration for a random realization with
just a random BF and with increasing artificially-added BF amplitudes in the
direction (l, b) = (295°, 5°), which is the direction of the best-fit BF of our
dataset (this choice of inserted direction allows a straightforward comparison
with the data and accounts for a possible bias, as mentioned in §3.1, but we
have verified that it has no effect on our significance estimation). The
results are consistent with the expected behavior: as the inserted BF amplitude
is increased, the scatter becomes smaller and converges to a region closer to
the inserted BF direction.

4.3 Significance estimation 

We compare $S_{\rm data}$ with the mean value of $S$ evaluated using 6,000
random simulations for each inserted BF amplitude $|\mathbf{v}_{\rm BF}|$ and
infer the significance of the data in terms of the probability that a random
ΛCDM realization will get a lower score than the data

$$ \mathcal{P}(S < S_{\rm data}) = \int \mathcal{P}(S < S_{\rm data}\,|\,|\mathbf{v}|)\, \mathcal{P}(|\mathbf{v}|)\, d|\mathbf{v}|, \tag{17} $$

where $\mathcal{P}(|\mathbf{v}|)\,d|\mathbf{v}|$ is the probability that a ΛCDM
realization of the data will have a BF of amplitude between $|\mathbf{v}|$ and
$|\mathbf{v}| + d|\mathbf{v}|$ given the sparsity of the data according to
§3.2, and $\mathcal{P}(S < S_{\rm data}\,|\,|\mathbf{v}|)$ is the conditional
probability that a random simulation will result in $S < S_{\rm data}$ given a
BF amplitude $|\mathbf{v}|$.
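One way to evaluate Eq. (17) numerically is a trapezoidal sum over the grid of inserted amplitudes, weighting the simulated conditional probabilities by the $\chi_3$ form of Eq. (13). The sketch below assumes that analytic weight with $\sigma_{\rm fit} = 150$ km/s. The function names and the use of the trapezoidal rule are illustrative choices, not taken from the paper.

```python
import math

def chi3_pdf(v, sigma):
    """PDF of |v| for the chi_3 (Maxwell-Boltzmann) form with rms `sigma`."""
    a = sigma / math.sqrt(3)  # standard deviation of each Cartesian component
    return math.sqrt(2 / math.pi) * v * v * math.exp(-v * v / (2 * a * a)) / a ** 3

def significance(amplitudes, cond_prob, sigma_fit=150.0):
    """Trapezoidal estimate of Eq. (17).

    amplitudes: increasing grid of BF amplitudes |v| in km/s
    cond_prob:  P(S < S_data | |v|) estimated from simulations on that grid
    """
    integrand = [p * chi3_pdf(v, sigma_fit)
                 for v, p in zip(amplitudes, cond_prob)]
    return sum(0.5 * (integrand[k] + integrand[k + 1])
               * (amplitudes[k + 1] - amplitudes[k])
               for k in range(len(amplitudes) - 1))
```

On a grid that stops at a finite amplitude, the sum covers only that range of the weight; how the contribution from larger amplitudes is treated is left open here.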

5 RESULTS

5.1 Full dataset

Before applying our scatter-based method described in the last section, we note
that using a naive best-fit, the overall BF in our dataset has an amplitude
$|\mathbf{v}_{\rm BF}| = 260$ km/s and points in the direction
(l, b) = (295°, 5°), which is in agreement with results reported elsewhere
[23I I2H I27I . [2g| | . This direction lies in proximity to features in the
CMB (most of all to the giant rings reported in [39]), but is also close to the
galactic plane, as might have been expected given the sparsity bias shown in
Fig. 2. In addition, referring back to Fig. 3, we see that when comparing with
the expected rms amplitude in a finite-size survey with the same spatial
distribution, this amplitude, although high, is consistent with ΛCDM at the 95%
C.L. (naively using the unbiased ΛCDM expectation, one might have assigned a
much larger significance to this result).

Thus, using the full dataset, we conclude that no claim can be made as to the
existence of a cosmological BF in the Union2.1 type-Ia SNe data up to redshift
z = 0.2 given the significant bias induced by the poor sky coverage and
redshift-volume density of this dataset.

5.2 Sifting iteratively through the data 

The scatter of the best-fit BF direction measured in the iterative process
described in §4 is plotted in the left panel of Fig. 5. In the right panel we
plot the luminosity distance to the excluded SN as a function of the iteration
number, and demonstrate that our results are not dominated by a subset of
nearby SNe (SNe at distances > 500 Mpc/h remain in the dataset until the final
iterations). Comparing Figs. 4 (Left) and 5 (Left) we see that the scatter of
the data is much smaller than that of the realizations with
$|\mathbf{v}_{\rm BF}| < 300$ km/s, and is comparable in size to a
$|\mathbf{v}_{\rm BF}| > 450$ km/s realization.

This is also shown quantitatively in Fig. 6 (Left), where we plot the
significance of the compactness of the scatter with respect to ΛCDM, as
described by Eq. (17), as a function of the total number of iterations
$N_{\rm iter}$. For $N_{\rm iter} = 110$ (chosen arbitrarily), we show in
Fig. 6 (Right) a few percentiles of the results of the normalized score
evaluated for random realizations of the data according to Eq. (17), along with
the data result. We see that the score for our dataset is outside the 95% C.L.
for any initial (i.e. for the whole dataset) BF amplitude smaller than
300 km/s.

Integrating over all possible initial BF amplitudes, we see that $S_{\rm data}$
is surprisingly low with respect to the expectation from a ΛCDM universe: the
overall probability that a single ΛCDM realization would get a score that is as
low as the score of the real data is < 0.5% for any $N_{\rm iter} > 30$, and
gets as low as 0.1% for some choices of $N_{\rm iter}$. Thus we conclude that
the scatter of the best-fit BF direction is significantly low, at a > 99.5%
C.L.



Figure 5. Left: The scatter of the best-fit BF in the Union2.1 dataset at redshifts z < 0.2. The black stars mark the best-fit BF direction
of each iteration until only 10 SNe are left in the dataset. The plus signs indicate the spatial coordinate of the SN that is excluded at 
each iteration and are coloured as a function of the iteration number (blue - excluded first, red - excluded last). Right: The distance to 
the excluded SN in each iteration as a function of the iteration number. The dashed line marks the distance to the farthest SN remaining 
in the dataset at each iteration. 




6 CONCLUSIONS 

The goal of this work was to use the most recent compilation of type-Ia SNe
measurements in order to test the claims of a peculiar velocity BF in different
studies. After truncating the Union2.1 catalogue at redshift z = 0.2 and
extracting the radial peculiar velocity field, we showed that a naive attempt
to measure a best-fit BF in this field ignores a significant bias due to its
sparse spatial distribution and yields inconclusive results for the amplitude
and direction of the best-fit flow. This sparsity bias was discussed in detail
above along with the difficulty in determining the correct ΛCDM prediction with
which any result should be compared. We presented a prescription for estimating
this value in a finite survey of given redshift extent and spatial
distribution, and concluded that the BF amplitude measured in the Union2.1 data
up to redshift z = 0.2 is consistent with the 95% C.L. limits.



Given the consistency in the reports from a wide spec- 
trum of analyses regarding the direction of the measured BF 
and the alignment between the reported values and certain 
CMB features, we focused on the direction and introduced a 
method which measures the scatter in the best-fit BF direc- 
tion as outlying points are removed in iterations. We were 
careful to use realistic expectations for a BF amplitude in a 
sparse dataset and used Monte Carlo simulations with simi- 
lar sparsity to estimate the significance of our findings. Our 
results suggest that the Union2.1 data up to redshift z = 0.2 
contains an anomalous BF at a 99.5% C.L. compared to ran- 
dom simulations with the same sparsity as the data. 

In the future, as more data is collected, the method 
used in this work will become more and more robust and 
enable the measurement of the BF in consecutive redshift 
bins to yield a better analysis of the redshift dependence 
of the measured result. In addition, it might be possible to 
focus on measurements from a single survey and thus reduce 
the errors stemming from combining several surveys with
different characteristics. 

If the reports of a BF which is inconsistent with ΛCDM are verified by future
observations, it shall serve as a promising lead for theoretical research
exploring areas beyond the concordance cosmological model. The full potential
of type-Ia SNe data to settle this issue is yet to be realized.